Ontology design patterns to disambiguate relations between genes and gene products in GENIA
Authors
Abstract
MOTIVATION: Annotated reference corpora play an important role in biomedical information extraction. Semantic annotation of the natural language texts in these reference corpora using formal ontologies is challenging due to the inherent ambiguity of natural language. Providing formal definitions and axioms for semantic annotations offers a means of ensuring consistency and enables the development of verifiable annotation guidelines. Consistent semantic annotations facilitate the automatic discovery of new information through deductive inference.

RESULTS: We provide a formal characterization of the relations used in the recent GENIA corpus annotations. For this purpose, we select existing axiom systems based on the desired properties of the relations within the domain and develop new axioms for several relations. To apply this ontology of relations to the semantic annotation of text corpora, we implement two ontology design patterns. In addition, we provide a software application that converts annotated GENIA abstracts into OWL ontologies by combining the ontology of relations with the design patterns. As a result, the GENIA abstracts become available as OWL ontologies and are amenable to automated verification, deductive inference and other knowledge-based applications.

AVAILABILITY: Documentation, implementation and examples are available from http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/.
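The kind of deductive inference the abstract refers to can be illustrated with a minimal sketch. It assumes a relation that has been axiomatized as transitive, and it uses plain Python rather than an OWL reasoner. The entity names and the asserted pairs are hypothetical and are not taken from the GENIA annotations or from the authors' software.

```python
from itertools import product


def transitive_closure(pairs):
    """Return the smallest transitive relation containing `pairs`."""
    closure = set(pairs)
    while True:
        # Chain (a, b) and (b, d) into (a, d) for every matching pair.
        new = {(a, d) for (a, b), (c, d) in product(closure, closure) if b == c}
        if new <= closure:
            return closure
        closure |= new


# Hypothetical asserted annotations for a relation assumed to be transitive
# (illustrative only, not from the GENIA corpus).
asserted = {
    ("binding_site", "promoter"),
    ("promoter", "gene_X"),
}

# Facts that follow from the transitivity axiom but were never annotated.
inferred = transitive_closure(asserted) - asserted
print(sorted(inferred))  # [('binding_site', 'gene_X')]
```

An OWL reasoner derives such facts from the axioms declared in the ontology; the loop above only mimics that behaviour for a single transitive relation.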
Similar resources
Applying ontology design patterns to the implementation of relations in GENIA
Motivation: Annotated reference corpora such as the GENIA corpus play an important role in biomedical information extraction. A semantic annotation of the natural language texts in these reference corpora using formal ontologies and logic is challenging due to the ambiguous use of natural language and natural language semantics. Providing formal definitions and axioms for these relations would ...
Identification and prioritization genes related to Hypercholesterolemia QTLs using gene ontology and protein interaction networks
Gene identification represents the first step to a better understanding of the physiological role of the underlying protein and disease pathways, which in turn serves as a starting point for developing therapeutic interventions. Familial hypercholesterolemia is a hereditary metabolic disorder characterized by high low-density lipoprotein cholesterol levels. Hypercholesterolemia is a quantitativ...
Unsupervised Learning of Semantic Relations between Concepts of a Molecular Biology Ontology
We present an unsupervised model for learning arbitrary relations between the concepts defined in a molecular biology ontology for the purpose of text data mining and support to manual ontology building. Relations are learned from the GENIA corpus, in which named-entities representing the GENIA ontology concepts have been tagged, by means of several natural language processing techniques. We ca...
Enhancing a Biological Concept Ontology to Fuzzy Relational Ontology with Relations Mined from Text
In this paper we investigate the problem of enriching an existing biological concept ontology into a fuzzy relational ontology structure using generic biological relations and their strengths mined from tagged biological text documents. Though biological relations in a text are defined between a pair of entities, the entities are usually tagged by their concept names in a tagged corpus. Since t...
xGENIA: A comprehensive OWL ontology based on the GENIA corpus
The GENIA ontology is a taxonomy that was developed as a result of manual annotation of a subset of MEDLINE, the GENIA corpus. Both the ontology and corpus have been used as a benchmark to test and develop biological information extraction tools. Recent work shows, however, that there is a demand for a more comprehensive ontology that would go along with the corpus. We propose a comp...
Journal title:
Volume 2, Issue
Pages -
Publication date 2011